1 Introduction

This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code. This notebook replicates results in \(\textbf{New tools for network time series with an application to COVID-19 hospitalisations}\); see https://arxiv.org/abs/2312.00530, and produces further Corbit and R-Corbit plots examples.

2 Sample Corbit plots and GNAR simulations

Simulate realisations of differing length coming from the stationary global-\(\alpha\) GNAR model below

\[ \begin{equation} \label{eq:vector wise representation} \boldsymbol{X}_{t} = \sum_{k = 1}^{p} \left (\alpha_{k} \boldsymbol{X}_{t - k} + \sum_{r = 1}^{s_k} \beta_{kr} \boldsymbol{Z}_{r, t - k} \right ) + \boldsymbol{u}_{t} , \end{equation} \] We use the \(\texttt{fiveNet}\) (included in the CRAN \(\texttt{GNAR}\) package) network throughout this notebook for producing all relevant figures.

2.1 Producing GNAR simulations

We begin by loading the GNAR package into our session in \(\texttt{R}\).

  library(GNAR)
  source("/Users/danielsalnikov/Documents/PhD/my_stuff/papers/corbit_paper/R_scripts/R_scripts_revised/R-Corbit_Covid.R")
## 
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## The following object is masked from 'package:base':
## 
##     union
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## Loading required package: viridisLite
## 
## Attaching package: 'ggpubr'
## The following object is masked from 'package:forecast':
## 
##     gghistogram

We simulate a realisation of length one-hundred by using the function \(\texttt{GNARsim}\). The code for producing the first simulation also shows how to compute the \(r\)-stage adjacency matrices and the weights matrix for the \(\texttt{fiveNet}\) network.

  n_sims = 100
  # Compute the r-stage adjacency matrices
  stages_tensor = get_k_stages_adjacency_tensor(
  as.matrix(GNARtoigraph(fiveNet)), 
  3)
  # Compute the weights-matrix for the fiveNet network
  W_normalised = weights_matrix(fiveNet, 3)

  
  # First Simulation GNAR(1, [1])
  sim1 <-  GNARsim(n = n_sims, net=fiveNet,
                 alphaParams = list(rep(0.47, 5)), 
                 betaParams = list(c(0.46)))

We proceed to compute a Corbit plot for the simulation produced above.

  m_lag = 20
  m_stage = 5
  corbit_plot(vts = sim1, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'cividis')
Figure 1: Corbit plot for a realisation of length 100 coming from a stationary global-alpha GNAR(1, [1]) process.

Figure 1: Corbit plot for a realisation of length 100 coming from a stationary global-alpha GNAR(1, [1]) process.

The Corbit plot above uses the \(\texttt{cividis}\) colour scale, however, users can specify any colour scale supported by the \(\texttt{viridis}\) package (all of which are colour-blind friendly). Note that the Corbit plot can quickly aid analysts for detecting seasonality and trend in an observed realisation of a (network) time series. We incorporate a seasonal term as a cosine wave of period four into simulation one and produce the new Corbit plot.

  m_lag = 20
  m_stage = 5
  sim1_seasonal = vapply(c(1:5), function(x) {sim1[, x] + 2 * (1 + cos(c(1:n_sims) * pi / 2)) }, rep(0, n_sims))
  corbit_plot(vts = sim1_seasonal, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'cividis')
Figure 2: Corbit plot for a realisation of length 100 coming from a seasonal global-alpha GNAR(1, [1]) process.

Figure 2: Corbit plot for a realisation of length 100 coming from a seasonal global-alpha GNAR(1, [1]) process.

The Corbit plot in Figure 2 successfully reflects the seasonal period every four lags. We incorporate a linear trend into Simulation 1 below, and analyse the resulting Corbit plot below.

  m_lag = 20
  m_stage = 5
  sim1_trend = vapply(c(1:5), function(x) {sim1[, x] + c(1:n_sims) }, rep(0, n_sims))
  corbit_plot(vts = sim1_trend, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'cividis')
Figure 3: Corbit plot for a realisation of length 100 coming from a global-alpha GNAR(1, [1]) process plus a linear trend.

Figure 3: Corbit plot for a realisation of length 100 coming from a global-alpha GNAR(1, [1]) process plus a linear trend.

Finally, we incorporate both a trend and seasonal component into Simulation 1, and analyse the resulting Corbit plot.

  m_lag = 20
  m_stage = 5
  sim1_trend_seasonal = vapply(c(1:5), function(x) {sim1[, x] + 2 * (1 + cos(c(1:n_sims) * pi / 2)) + c(1:n_sims) / 10 }, rep(0, n_sims))
  corbit_plot(vts = sim1_trend_seasonal, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'cividis')
Figure 3: Corbit plot for a realisation of length 100 coming from a global-alpha GNAR(1, [1]) process plus a linear trend.

Figure 3: Corbit plot for a realisation of length 100 coming from a global-alpha GNAR(1, [1]) process plus a linear trend.

2.2 Recovering model order

We simulate a realisation of length one-thousand coming from a stationary global-\(\alpha\) GNAR\((3, [2, 1, 0])\) process. Subsequently, we compute the mean value of the NACF and PNACF at each iteration.

  # Simulate a realisation of length 100 coming from a stationary GNAR(2, [1, 2]) 
  nsims_2 = 1000
  sim2 <- GNARsim(n = nsims_2, net=fiveNet,
                alphaParams = list(rep(0.12, 5), rep(0.18, 5)), 
                betaParams = list(c(0.21), c(-0.32, 0.12)))
  # Compute the corresponding Corbit plot for the PNACF
  corbit_plot(vts = sim2, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'turbo', partial = "yes")

We examine another model.

  # Simulate a realisation of length 100 coming from a stationary GNAR(2, [1, 2]) 
  nsims = 100
  sim2 <- GNARsim(n = nsims, net=fiveNet,
                alphaParams = list(rep(0.21, 5), rep(0.18, 5)), 
                betaParams = list(c(0.12), c(0.32, 0.12)))
  # Compute the corresponding Corbit plot for the PNACF
  corbit_plot(vts = sim2, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'viridis', partial = "yes")

We examine a basic GNAR model

  # Simulate a realisation of length 100 coming from a stationary GNAR(2, [1, 2]) 
  nsims = 100
  sim2 <- GNARsim(n = nsims, net=fiveNet,
                alphaParams = list(rep(0.35, 5), rep(0.12, 5)), 
                betaParams = list(c(0.24), c(0.17)))
  # Compute the corresponding Corbit plot for the PNACF
  corbit_plot(vts = sim2, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'viridis', partial = "yes")

2.3 Producing R-Corbit plots

R-Corbit plots allow quick comparison of network correlation structures for different time-slices and/or covariates (e.g., labels). We produce an example comparing three distinct GNAR simulations below.

# Initialise values
m_lag = 20
m_stage = 3
n_sims = 200
set.seed(2023)

# Produce GNAR simulations
sim1 <-  GNARsim(n = n_sims, net=fiveNet, alphaParams = list(c(0.1, 0.12, 0.16, 0.075, 0.21),
                                                                  c(0.12, 0.14, 0.15, 0.3, 0.22)), 
                        betaParams = list(c(0.16, 0.10), c(0.14, 0.11)))

sim2 <-  GNARsim(n = n_sims, net=fiveNet, alphaParams = list(rep(0.23, 5), rep(0.11, 5)), 
                 betaParams = list(c(0.21), c(0.12)))

sim3 <-  GNARsim(n = n_sims, net=fiveNet, alphaParams = list(rep(0.12, 5), rep(0.10, 5)), 
                 betaParams = list(c(0.25, 0.27), c(0.18)))

# Produce R-Corbot plots
R_corbit_plot(list(sim1, sim2, sim3), list(fiveNet), m_lag, m_stage, frame_names = c("GNAR(1, [2, 2]) simulation.", "GNAR(1, [1]) simulation.", "GNAR(1, [2]) simulation."), same_net = "yes")

R_corbit_plot(list(sim1, sim2, sim3), list(fiveNet), m_lag, m_stage, frame_names = c("GNAR(1, [2, 2]) simulation.", "GNAR(1, [1]) simulation.", "GNAR(1, [2]) simulation."), same_net = "yes", partial = "yes")

3 Analysing the COVID-19 NHS network time-series

We start by loading the NHS trust network time series (included in the \(\texttt{GNAR}\) package). The network time series is a realisation of length four-hundred-and-fifty-two (452) that corresponds to the logarithm of the daily number of patients occupying mechanical ventilation beds, i.e., \(\log (1 + Y_{i, t})\), where \(Y_{i, t}\) is the number of patients occupying mechanical ventilation beds in NHS trust \(i = 1, \dots, 140\) on day \(t = 1, \dots, 452\).

# Preliminary objects
# load the NHS trust network time series
covid_data = logMVbedMVC.vts

# Compute the weight matrix for the NHS trusts network
W_nhs_trust = weights_matrix(NHSTrustMVCAug120.net, 1)

We visualise the evolution of the series by plotting the sum of the total number of patients transferred to mechanical ventilation beds on each day. Further, we compute the mean and standard deviaiton up to day \(t\) for exploratory data analysis.

  sum_series = as.ts(rowSums(logMVbedMVC.vts))
plot(c(1:452), sum_series, 'l', xlab = 't', main = expression(paste('Plot of the Sum of all NHS time-series: ', Y[t] == sum(X["i, t"], i == 1, 140))), 
     ylab = expression(Y[t]), col = 'blue')

plot(c(1:452), sum_series / 140, 'l', xlab = 't', main = expression(paste('Plot of daily mean occupancy of all NHS time-series: ', Y[t] == sum(X["i, t"], i == 1, 140) / 140)), 
     ylab = expression(Y[t]), col = 'blue')

# compute mean and standard deviation up top day t
sum_series_mean = vapply(c(1:length(sum_series)), function(x) {mean(sum_series[1:x])}, 0)
sum_series_sd = vapply(c(1:length(sum_series)), function(x) {sd(sum_series[1:x])}, 0)

# plot mean and standard deviation
plot(c(1:452), sum_series_mean, 'l', xlab = 't', main = expression(paste('Plot of the mean sum of all NHS time-series: ', Y[t] == sum(X["i, t"], i == 1, 140))), 
     ylab = expression(mean(Y[t])), col = 'blue')

plot(c(1:452), sum_series_sd, 'l', xlab = 't', main = expression(paste('Plot of the sum sd of all NHS time-series: ', Y[t] == sum(X["i, t"], i == 1, 140))), 
     ylab = expression(sd(Y[t])), col = 'blue')

3.1 Model order selection

We analyse the network autocorrelation structure by analysing the Corbit plots below. First, we produce the NACF Corbit plot.

  corbit_plot(covid_data, NHSTrustMVCAug120.net, 20, 6)

Subsequently, we analyse the PNACF Corbit plot, which aids model selection and visualising dependence in the data.

  corbit_plot(covid_data, NHSTrustMVCAug120.net, 10, 6, partial="yes")

The Corbit plots above suggest that network autocorrelation cuts-off after the first lag, further, that one-stage neighbours appear to have a stronger correlation than higher-order \(r\)-stage neighbours. These Corbit plots suggest comparing the following models.

  # Create a table for storing model diagnostic statistics
  GNAR_model_comparison_table <- array(dim = c(6, 5))
  colnames(GNAR_model_comparison_table) <- c("No. of param. ", "  One-Step SE  ", "  Two-Step SE  ", "AIC", "BIC")

We compare GNAR fits for the time-steps selected below.

  covid_data_set = covid_data[1:447, ]
  test_set = covid_data[448:452, ]

The code for fitting a GNAR\((1, [1])\) is

  # GNAR(1, [1])
  # model order specification
  plag = 1
  stages = c(1)
  
  # fitting the model
  covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
  
  # Number of parameters
  GNAR_model_comparison_table[1, 1] = length(covid_fit$mod$coefficients)
  GNAR_model_comparison_table[1, 4] = AIC(covid_fit)
  GNAR_model_comparison_table[1, 5] = BIC(covid_fit)
  
  # one-step ahead forecast
  one_step = predict(covid_fit)
  GNAR_model_comparison_table[1, 2] = sum((one_step - test_set[1, ])^2)
  
  # two-step ahead forecast
  new_data = rbind(covid_data_set, one_step)
  new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
  GNAR_model_comparison_table[1, 3] = sum((predict(new_fit)- test_set[2, ])^2)

The code for fitting a GNAR\((1, [6])\) is

  # GNAR(1, [6])
  # model order specification
  plag = 1
  stages = c(6)
  
  # fitting the model
  covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
  
  # Number of parameters
  GNAR_model_comparison_table[2, 1] = length(covid_fit$mod$coefficients)
  GNAR_model_comparison_table[2, 4] = AIC(covid_fit)
  GNAR_model_comparison_table[2, 5] = BIC(covid_fit)
  
  # one-step ahead forecast
  one_step = predict(covid_fit)
  GNAR_model_comparison_table[2, 2] = sum((one_step - test_set[1, ])^2)
  
  # two-step ahead forecast
  new_data = rbind(covid_data_set, one_step)
  new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
  GNAR_model_comparison_table[2, 3] = sum((predict(new_fit)- test_set[2, ])^2)

The code for fitting a GNAR\((2, [6, 6])\) is

  # GNAR(2, [6, 6])
  # model order specification
  plag = 2
  stages = c(6, 6)
  
  # fitting the model
  covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
  
  # Number of parameters
  GNAR_model_comparison_table[3, 1] = length(covid_fit$mod$coefficients)
  GNAR_model_comparison_table[3, 4] = AIC(covid_fit)
  GNAR_model_comparison_table[3, 5] = BIC(covid_fit)
  
  # one-step ahead forecast
  one_step = predict(covid_fit)
  GNAR_model_comparison_table[3, 2] = sum((one_step - test_set[1, ])^2)
  
  # two-step ahead forecast
  new_data = rbind(covid_data_set, one_step)
  new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
  GNAR_model_comparison_table[3, 3] = sum((predict(new_fit)- test_set[2, ])^2)

The code for fitting a GNAR\((6, [6^6])\) is

  # GNAR(6, [6, ..., 6])
  # model order specification
  plag = 6
  stages = rep(6, 6)
  
  # fitting the model
  covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
  
  # Number of parameters
  GNAR_model_comparison_table[4, 1] = length(covid_fit$mod$coefficients)
  GNAR_model_comparison_table[4, 4] = AIC(covid_fit)
  GNAR_model_comparison_table[4, 5] = BIC(covid_fit)
  
  # one-step ahead forecast
  one_step = predict(covid_fit)
  GNAR_model_comparison_table[4, 2] = sum((one_step - test_set[1, ])^2)
  
  # two-step ahead forecast
  new_data = rbind(covid_data_set, one_step)
  new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
  GNAR_model_comparison_table[4, 3] = sum((predict(new_fit)- test_set[2, ])^2)

The code for fitting a GNAR\((1, [0])\) is

  # GNAR(1, [0])
  # model order specification
  plag = 1
  stages = c(0)
  
  # fitting the model
  covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
  
  # Number of parameters
  GNAR_model_comparison_table[5, 1] = length(covid_fit$mod$coefficients)
  GNAR_model_comparison_table[5, 4] = AIC(covid_fit)
  GNAR_model_comparison_table[5, 5] = BIC(covid_fit)
  
  # one-step ahead forecast
  one_step = predict(covid_fit)
  GNAR_model_comparison_table[5, 2] = sum((one_step - test_set[1, ])^2)
  
  # two-step ahead forecast
  new_data = rbind(covid_data_set, one_step)
  new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
  GNAR_model_comparison_table[5, 3] = sum((predict(new_fit)- test_set[2, ])^2)

The code for fitting a GNAR\((10, [0])\) is

  # GNAR(10, [0])
  # model order specification
  plag = 10
  stages = rep(0, 10)
  
  # fitting the model
  covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
  
  # Number of parameters
  GNAR_model_comparison_table[6, 1] = length(covid_fit$mod$coefficients)
  GNAR_model_comparison_table[6, 4] = AIC(covid_fit)
  GNAR_model_comparison_table[6, 5] = BIC(covid_fit)
  
  # one-step ahead forecast
  one_step = predict(covid_fit)
  GNAR_model_comparison_table[6, 2] = sum((one_step - test_set[1, ])^2)
  
  # two-step ahead forecast
  new_data = rbind(covid_data_set, one_step)
  new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
  GNAR_model_comparison_table[6, 3] = sum((predict(new_fit)- test_set[2, ])^2)

We compare the models by printing the following table.

  GNAR_model_comparison_table = round(GNAR_model_comparison_table, 3)
  
  rownames(GNAR_model_comparison_table) <- c("GNAR(1, [1])", "GNAR(1, [6])", "GNAR(2, [6, 6])", "GNAR(6, [6,:6])", "GNAR(1, [0])", "GNAR(10, [0])")
  
  knitr::kable(GNAR_model_comparison_table, "simple")
No. of param. One-Step SE Two-Step SE AIC BIC
GNAR(1, [1]) 2 5.088 8.808 -452.393 -452.375
GNAR(1, [6]) 7 5.064 8.799 -452.532 -452.468
GNAR(2, [6, 6]) 14 5.166 8.218 -458.883 -458.755
GNAR(6, [6,:6]) 42 5.238 9.882 -467.350 -466.964
GNAR(1, [0]) 1 5.030 8.845 -451.739 -451.730
GNAR(10, [0]) 10 5.027 8.654 -472.229 -472.137

3.2 Comparison with VAR models

We compare a GNAR\((1, [1])\) with VAR and sparse VAR models. We begin by loading the required libraries.

  library(forecast)
  library(vars)
## Loading required package: MASS
## Loading required package: strucchange
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Loading required package: sandwich
## Loading required package: urca
## Loading required package: lmtest
  library(sparsevar)
## 
## Attaching package: 'sparsevar'
## The following object is masked from 'package:forecast':
## 
##     accuracy

We perform one-step ahead prediction ten times. The starting time-step and error vectors are initialised below.

  # number of one-step ahead predictions
  # dummt counter variable
  k = 0
  t_steps = 10
  
  # starting point
  covid_data = logMVbedMVC.vts
  initial_start = nrow(covid_data) - t_steps
  decimal_digits = 3
  
  # AR model prediction error
  ar_pe = rep(0, t_steps)
  
  # global-alpha GNAR model prediction error
  gnar_pe = rep(0, t_steps)
  # local-alpha GNAR model prediction error
  gnar_no_global_pe = rep(0, t_steps)
  
  # VAR prediction error
  var_pe = rep(0, t_steps)
  
  # restricted VAR prediction error
  res_var_pe = rep(0, t_steps)
  
  # sparse VAR prediction error
  sparse_var_pe = rep(0, t_steps)

We perform one-ahead forecasting ten times and save the squared error.

  for (i in 1:t_steps) {
  # initialise the comparison table
  output <- matrix(rep(0, 18), nrow = 6, ncol = 3)
  train_end = initial_start + k
  test_point = train_end + 1
  
# We compare the GNAR models in this script
# 140 AR(1) models
  arforecast <- apply(covid_data[1:train_end, ], 2, function(x){
    forecast(auto.arima(x, d = 0, D = 0, max.p = 1, max.q = 0,
             max.P = 0, max.Q = 0, stationary = TRUE, seasonal = FALSE, ic = 'bic',
             allowmean = FALSE, allowdrift = FALSE, trace = FALSE), h = 1)$mean 
  })
  output[5, 2] = "$140$"
  output[5, 1] = "AR(1)"
  output[5, 3] = round(sum((arforecast - covid_data[test_point, ])^2), digits = decimal_digits)
  ar_pe[i] = output[5, 3]

# GNAR(1, [1]) global-alpha fit
  covid_fit <- GNARfit(covid_data[1:train_end, ], net = NHSTrustMVCAug120.net, alphaOrder = 1, betaOrder = rep(1, 1))
  output[1, 2] = 2
  output[1, 3] = round(sum((predict(covid_fit) - covid_data[test_point, ])^2), digits = decimal_digits)
  gnar_pe[i] = output[1, 3]
  output[1, 1] = "GNAR(1, [1])"

# GNAR(1, [1]) local-alpha fit
  covid_fit <- GNARfit(covid_data[1:train_end, ], net = NHSTrustMVCAug120.net, alphaOrder = 1, betaOrder = rep(1, 1), globalalpha=FALSE)
  output[6, 2] = 141
  output[6, 3] = round(sum((predict(covid_fit) - covid_data[test_point, ])^2), digits = decimal_digits)
  gnar_no_global_pe[i] = output[6, 3]
  output[6, 1] = "GNAR(1, [1])*"


# Restricted VAR(1) model
  varforecast <- predict(restrict(VAR(covid_data[1:train_end, ], p = 1, type = 'none')), n.ahead = 1)
  get_var_fcst <- function(x){return (x[1])}
  var_point_fcst <- unlist(lapply(varforecast$fcst, get_var_fcst))
  output[3, 3] = round(sum((var_point_fcst - covid_data[test_point, ])^2), digits = decimal_digits)
  res_var_pe[i] = output[3, 3]
  active = 0.0
  aux = varforecast$model$varresult
  for (ii in 1:ncol(covid_data)) {
    for (j in 1:length(aux[[ii]][[1]])){
      if (abs(aux[[ii]][[1]][j]) > 0.0) {
        active = active + 1
      }
    }
  }
  output[3, 1] = "Res. VAR(1)"
  output[3, 2] = active


# Classic VAR(1) model
  varforecast <- predict(VAR(covid_data[1:train_end, ], p = 1, type = 'none'), n.ahead = 1)
  get_var_fcst <- function(x){return (x[1])}
  var_point_fcst <- unlist(lapply(varforecast$fcst, get_var_fcst))
  output[2, 2] = "$140^2$"
  output[2, 3] = round(sum((var_point_fcst - covid_data[test_point, ])^2), digits = decimal_digits)
  var_pe[i] = output[2, 3]
  output[2, 1] = "VAR(1)"

# Sparse VAR(1) model
  sparse_var_forecast <- fitVAR(covid_data[1:train_end, ], p = 1)
  xhat = sparse_var_forecast$A[[1]] %*% covid_data[train_end, ]
  output[4, 3] = round(sum((xhat - covid_data[test_point, ])^2), digits = decimal_digits)
  sparse_var_pe[i] = output[4, 3]
  b = sparse_var_forecast$A[[1]]
  b[abs(b) > 0.0] = 1
  output[4, 2] = sum(rowSums(b))
  output[4, 1] = "Sparse VAR(1)"
##########################
  # stack all ten tables
  colnames(output) <- c("Model", "Active Parameters", "One-Step SPE")
  print(output)
  cat("\n############################################# \n#############################################\n")
  if (i == 1) {
    out_data = output
  } else {
    out_data = rbind(out_data, output)
  }
  k = k + 1
  }
##      Model           Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])"  "2"               "5.441"     
## [2,] "VAR(1)"        "$140^2$"         "11.613"    
## [3,] "Res. VAR(1)"   "3821"            "9.902"     
## [4,] "Sparse VAR(1)" "2759"            "10.109"    
## [5,] "AR(1)"         "$140$"           "82.055"    
## [6,] "GNAR(1, [1])*" "141"             "5.324"     
## 
## ############################################# 
## #############################################
##      Model           Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])"  "2"               "7.86"      
## [2,] "VAR(1)"        "$140^2$"         "12.804"    
## [3,] "Res. VAR(1)"   "3712"            "11.504"    
## [4,] "Sparse VAR(1)" "2992"            "11.533"    
## [5,] "AR(1)"         "$140$"           "78.748"    
## [6,] "GNAR(1, [1])*" "141"             "7.833"     
## 
## ############################################# 
## #############################################
##      Model           Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])"  "2"               "10.286"    
## [2,] "VAR(1)"        "$140^2$"         "12.52"     
## [3,] "Res. VAR(1)"   "3730"            "11.605"    
## [4,] "Sparse VAR(1)" "3033"            "13.008"    
## [5,] "AR(1)"         "$140$"           "82.571"    
## [6,] "GNAR(1, [1])*" "141"             "10.173"    
## 
## ############################################# 
## #############################################
##      Model           Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])"  "2"               "12.953"    
## [2,] "VAR(1)"        "$140^2$"         "17.762"    
## [3,] "Res. VAR(1)"   "3812"            "14.472"    
## [4,] "Sparse VAR(1)" "3283"            "15.057"    
## [5,] "AR(1)"         "$140$"           "86.781"    
## [6,] "GNAR(1, [1])*" "141"             "12.676"    
## 
## ############################################# 
## #############################################
##      Model           Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])"  "2"               "7.525"     
## [2,] "VAR(1)"        "$140^2$"         "13.82"     
## [3,] "Res. VAR(1)"   "3761"            "10.815"    
## [4,] "Sparse VAR(1)" "3315"            "11.995"    
## [5,] "AR(1)"         "$140$"           "89.097"    
## [6,] "GNAR(1, [1])*" "141"             "7.431"     
## 
## ############################################# 
## #############################################
##      Model           Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])"  "2"               "5.088"     
## [2,] "VAR(1)"        "$140^2$"         "12.504"    
## [3,] "Res. VAR(1)"   "3830"            "9.976"     
## [4,] "Sparse VAR(1)" "3284"            "9.863"     
## [5,] "AR(1)"         "$140$"           "88.181"    
## [6,] "GNAR(1, [1])*" "141"             "4.989"     
## 
## ############################################# 
## #############################################
##      Model           Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])"  "2"               "4.44"      
## [2,] "VAR(1)"        "$140^2$"         "7.978"     
## [3,] "Res. VAR(1)"   "3738"            "7.716"     
## [4,] "Sparse VAR(1)" "3030"            "7.918"     
## [5,] "AR(1)"         "$140$"           "91.004"    
## [6,] "GNAR(1, [1])*" "141"             "4.466"     
## 
## ############################################# 
## #############################################
##      Model           Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])"  "2"               "3.061"     
## [2,] "VAR(1)"        "$140^2$"         "7.196"     
## [3,] "Res. VAR(1)"   "3759"            "5.888"     
## [4,] "Sparse VAR(1)" "3017"            "6.842"     
## [5,] "AR(1)"         "$140$"           "88.066"    
## [6,] "GNAR(1, [1])*" "141"             "3.089"     
## 
## ############################################# 
## #############################################
##      Model           Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])"  "2"               "8.868"     
## [2,] "VAR(1)"        "$140^2$"         "10.869"    
## [3,] "Res. VAR(1)"   "3772"            "10.999"    
## [4,] "Sparse VAR(1)" "3298"            "11.773"    
## [5,] "AR(1)"         "$140$"           "91.151"    
## [6,] "GNAR(1, [1])*" "141"             "8.607"     
## 
## ############################################# 
## #############################################
##      Model           Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])"  "2"               "8.623"     
## [2,] "VAR(1)"        "$140^2$"         "15.693"    
## [3,] "Res. VAR(1)"   "3794"            "13.258"    
## [4,] "Sparse VAR(1)" "3299"            "15.715"    
## [5,] "AR(1)"         "$140$"           "96.361"    
## [6,] "GNAR(1, [1])*" "141"             "8.546"     
## 
## ############################################# 
## #############################################
  comp_results = cbind(as.numeric(gnar_pe), as.numeric(sparse_var_pe), as.numeric(res_var_pe), as.numeric(var_pe), as.numeric(ar_pe), as.numeric(gnar_no_global_pe))
  print(colMeans(comp_results))
## [1]  7.4145 11.3813 10.6135 12.2759 87.4015  7.3134
  aux = out_data[out_data[, 1] == 'Sparse VAR(1)']
  print(mean(as.numeric(aux[11:20])))
## [1] 3131
  # 3089
  aux = out_data[out_data[, 1] == 'Res. VAR(1)']
  print(mean(as.numeric(aux[11:20])))
## [1] 3772.9
  # 3773
  all_summary = matrix(rep(0, 18), nrow = 6, ncol = 3)
  all_summary[, 1] = colMeans(comp_results)
  all_summary[, 2] = c(2, 2962, 3767, 19600, 140, 141)
  for (j in 1:ncol(comp_results)) {
    all_summary[j, 3] = sd(comp_results[, j])
  }
  knitr::kable(all_summary, "simple")
7.4145 2 2.977923
11.3813 2962 2.828714
10.6135 3767 2.482946
12.2759 19600 3.184070
87.4015 140 5.146954
7.3134 141 2.900672

3.3 Comparison with CAR model

For completeness we also compare GNAR with a conditional autoregressive temporal autoregressive (CARar) model. Mean squared prediction error (MSPE) comparisons, including standard deviation (sd. MSPE) follow in the Table below. Interestingly, both models perform similarly. It turns out that CARar models can be seen as GNAR models constrained to one-stage neighbourhood regressions. Computation for CARar requires expensive MCMC calls, thus, for brevity we summarise the results below; see https://github.com/dansal182/new_tools_for_network_time_series_with_application_to_COVID_19_hospitalisations for the experiments code.

GNAR(1, [1]) sp. VAR res. VAR VAR CARar GNARc Naive
MSPE 7.40 11.4 10.60 12.30 7.5 7.20 7.60
sd. MSPE 2.98 2.8 2.48 3.18 3.1 2.82 3.12
No. Parameters 2.00 3089.0 3773.00 19600.00 5.0 2.00 0.00

Above, GNARc denotes GNAR model with centred columns, i.e., we subtract the column mean before fitting the model. We also remark the difference in execution time (in seconds):

Total time Average time
GNAR(1, [1]) 6.2 0.62
CARar(1) 7821.9 782.19

Execution time comparison between global-\(\alpha\) GNAR models and CARar models. The total execution time is the time computing one-step ahead forecasts ten times (i.e., the forecasts used for producing MSPE Table) takes in seconds. The average time is the total time divided by ten.

3.4 Analysing node impact on network autocorrelation

We analyse the impact individual nodes have on network autocorrelation for our choice of model order. Further, we study the relative importance each node has in pair-wise node regressions, and compare the one-lag cross-correlation matrix with a heat map obtained from Theorem 2, this comparison shows the self-similar clustering behaviour for pair-wise correlations.

We start by defining the different waves of the pandemic and looking at a R-Corbit plot of the realised network autocorrelation.

  covid_data = logMVbedMVC.vts
  W_nhs_network = weights_matrix(NHSTrustMVCAug120.net, 6)


  first_wave = covid_data[1:100, ]
  gap = covid_data[101:177, ]
  second_wave = covid_data[178:447, ]
  time_slices = c('First Wave: 04-2020 -> 07-2020', 'Gap: 07-2020 -> 09-2020', 'Second Wave: 09-2020 -> 07-2021')
  R_corbit_plot(list(first_wave, gap, second_wave), list(NHSTrustMVCAug120.net), 10, 6, weight_matrices = list(W_nhs_network), same_net = 'yes', 
            frame_names = time_slices)

The PNACF R-Corbit plot is produced with the code below.

  R_corbit_plot(list(first_wave, gap, second_wave), list(NHSTrustMVCAug120.net), 10, 6, weight_matrices = list(W_nhs_network), same_net = 'yes', 
            frame_names = time_slices, partial = 'yes')

We explore the importance individual nodes have for predicting each other with the heat-map below. Note that it is not symmetric, the yellow colour highlights the strongest pair-wise realations according to formula (18) in \(\textbf{New tools for network time series with an application to COVID-19 hospitalisations}\).

 active_node_plot(second_wave, NHSTrustMVCAug120.net, 1, c(1))

We continue by looking at the impact each node has on the total network autocorrelation (i.e., the column in \(\mathbf{W}\) that dominates autocorrelation). We do this by producing the node relevance hierarchy and plot, which is computed using (17).

  node_relevance_plot(NHSTrustMVCAug120.net, 1, colnames(covid_data), 1)
## [[1]]

## 
## [[2]]
##     Node Relevance
## 1    RGP 0.2388075
## 2    REF 0.2590320
## 3    RNN 0.3423592
## 4    RM1 0.3481702
## 5    RJL 0.3482829
## 6    RWA 0.4361727
## 7    RVV 0.4489792
## 8    RXC 0.4656740
## 9    RLQ 0.4718158
## 10   RCX 0.4981928
## 11   R1F 0.5329288
## 12   RTF 0.5451035
## 13   RTD 0.5607670
## 14   RWD 0.5672991
## 15   RXH 0.5723261
## 16   R0B 0.5776501
## 17   RR7 0.5914945
## 18   RBD 0.6028521
## 19   RWP 0.6045146
## 20   R0D 0.6237309
## 21   RDY 0.6300686
## 22   RBL 0.6343268
## 23   RVY 0.6360888
## 24   RYR 0.6388915
## 25   RBZ 0.6407169
## 26   REN 0.6555618
## 27   RXW 0.6644188
## 28   RTQ 0.6657547
## 29   RXL 0.6672037
## 30   RD1 0.6738285
## 31   RAJ 0.6772388
## 32   RA7 0.6775857
## 33   REM 0.6792473
## 34   RNZ 0.6839019
## 35   RQ3 0.6840216
## 36   RRK 0.6843320
## 37   RET 0.6886182
## 38   RRJ 0.6945536
## 39   RVJ 0.6959880
## 40   RKB 0.6985588
## 41   RHU 0.7060903
## 42   RJR 0.7124531
## 43   RWF 0.7136386
## 44   RH8 0.7216425
## 45   RLT 0.7233334
## 46   RBS 0.7266070
## 47   RJC 0.7274836
## 48   RBQ 0.7277001
## 49   RWE 0.7277560
## 50   RXN 0.7293355
## 51   RDE 0.7317239
## 52   RGN 0.7323235
## 53   RNA 0.7341533
## 54   RPA 0.7402341
## 55   RTR 0.7405527
## 56   RK9 0.7428719
## 57   RXK 0.7429187
## 58   RVW 0.7525139
## 59   RGR 0.7581291
## 60   RBN 0.7648500
## 61   RP5 0.7679661
## 62   RL4 0.7741657
## 63   RHM 0.7741876
## 64   RTX 0.7802541
## 65   RXE 0.7804340
## 66   RFR 0.7804340
## 67   RRF 0.7837088
## 68   RA9 0.7865553
## 69   RH5 0.7924808
## 70   RN7 0.7947044
## 71   RNQ 0.7962871
## 72   RP1 0.7974614
## 73   RBK 0.7993277
## 74   RXP 0.8151463
## 75   RF4 0.8178328
## 76   RTP 0.8213661
## 77   RA4 0.8273467
## 78   RXR 0.8290119
## 79   RJ2 0.8296417
## 80   RJ6 0.8388167
## 81   RCB 0.8419626
## 82   RMC 0.8420556
## 83   RWW 0.8421183
## 84   R1H 0.8458930
## 85   RJ1 0.8477358
## 86   RQX 0.8503438
## 87   RAP 0.8514368
## 88   RP4 0.8529093
## 89   RJZ 0.8534630
## 90   RRV 0.8548508
## 91   RTK 0.8587055
## 92   RJ7 0.8592502
## 93   RKE 0.8602540
## 94   RAL 0.8605143
## 95   RK5 0.8607855
## 96   RNS 0.8640082
## 97   RX1 0.8650822
## 98   RVR 0.8651983
## 99   RPY 0.8655824
## 100  RQM 0.8663747
## 101  RYJ 0.8680895
## 102  RWH 0.8682384
## 103  RT1 0.8695644
## 104  RR8 0.8723391
## 105  RGT 0.8759949
## 106  RGM 0.8763338
## 107  RAX 0.8783572
## 108  RC9 0.8839073
## 109  RXF 0.8857286
## 110  RA2 0.8887197
## 111  RAN 0.8925245
## 112  RM3 0.8945031
## 113  RQW 0.8950113
## 114  R1K 0.8981421
## 115  RW6 0.8999965
## 116  RAE 0.9023903
## 117  RAS 0.9059743
## 118  RDU 0.9060968
## 119  RTG 0.9105164
## 120  RWG 0.9174765
## 121  RN5 0.9209021
## 122  RHQ 0.9258074
## 123  RBT 0.9285356
## 124  RD8 0.9324912
## 125  RWY 0.9352884
## 126  RCF 0.9354989
## 127  RFF 0.9416439
## 128  RCU 0.9437173
## 129  R0A 0.9490081
## 130  RFS 0.9523243
## 131  RTH 0.9524462
## 132  RNU 0.9609052
## 133  RJE 0.9688577
## 134  RN3 0.9700914
## 135  RMP 0.9724912
## 136  RCD 0.9729392
## 137  RHW 0.9736950
## 138  RWJ 0.9790502
## 139  RJN 0.9880315
## 140  RXQ 1.0000000

We compare the observed one-lag cross-correlation and the structure Theorem 2 suggests in the plots below. Note that the neighbourhood and self-similarity suggests that outbreaks are mostly dependent on direct neighbours, i.e., the virus cannot reach nodes by jumps it must traverse all NHS trusts (i.e., nodes) in a path between two places for instigating an outbreak.

# larger r-stage changes the correlation structure
  max_r_stage = 1
  local_relevance_plot(NHSTrustMVCAug120.net, max_r_stage)

  cross_correlation_plot(1, covid_data)

All three plots, except for the node relevance one, have the following column and row names.

  dummy_vec = vapply(c(1:140), function(x) {cat(paste0(x, ": ", colnames(covid_data)[x]), "\n"); return (0)}, 0)
## 1: RCF 
## 2: RBS 
## 3: RTK 
## 4: RF4 
## 5: RFF 
## 6: R1H 
## 7: RC9 
## 8: RQ3 
## 9: RXL 
## 10: RMC 
## 11: RAE 
## 12: RXH 
## 13: RXQ 
## 14: RWY 
## 15: RGT 
## 16: RT1 
## 17: RQM 
## 18: RFS 
## 19: RJR 
## 20: RXP 
## 21: RJ6 
## 22: RN7 
## 23: RP5 
## 24: RBD 
## 25: RDY 
## 26: RJN 
## 27: RVV 
## 28: RXR 
## 29: RDE 
## 30: RXC 
## 31: RWH 
## 32: RVR 
## 33: RDU 
## 34: RR7 
## 35: RLT 
## 36: RTQ 
## 37: RP4 
## 38: RN3 
## 39: RJ1 
## 40: RN5 
## 41: RCD 
## 42: RQX 
## 43: RWA 
## 44: RYJ 
## 45: R1F 
## 46: RGP 
## 47: RNQ 
## 48: RJZ 
## 49: RAX 
## 50: RXN 
## 51: RR8 
## 52: RJ2 
## 53: RBQ 
## 54: REM 
## 55: R1K 
## 56: RWF 
## 57: R0A 
## 58: RPA 
## 59: RBT 
## 60: RXF 
## 61: RAJ 
## 62: RD8 
## 63: RM1 
## 64: RVJ 
## 65: RNN 
## 66: RAP 
## 67: RVW 
## 68: RGN 
## 69: RNS 
## 70: RP1 
## 71: RBZ 
## 72: RJL 
## 73: RTF 
## 74: RX1 
## 75: RNU 
## 76: RTH 
## 77: RW6 
## 78: RHU 
## 79: RXE 
## 80: RHW 
## 81: REF 
## 82: RH8 
## 83: RAL 
## 84: RAN 
## 85: RGM 
## 86: RA2 
## 87: RD1 
## 88: RM3 
## 89: RNZ 
## 90: RXK 
## 91: RCU 
## 92: RHQ 
## 93: RK5 
## 94: RH5 
## 95: RTR 
## 96: R0B 
## 97: RJC 
## 98: RVY 
## 99: RJ7 
## 100: RBN 
## 101: RWJ 
## 102: RTP 
## 103: RMP 
## 104: REN 
## 105: RNA 
## 106: RAS 
## 107: RTD 
## 108: RQW 
## 109: RCX 
## 110: RFR 
## 111: RPY 
## 112: RRJ 
## 113: RL4 
## 114: RXW 
## 115: RET 
## 116: RA9 
## 117: RWD 
## 118: RRV 
## 119: RHM 
## 120: RRK 
## 121: RA7 
## 122: RKB 
## 123: R0D 
## 124: RK9 
## 125: RTG 
## 126: RWE 
## 127: RTX 
## 128: RJE 
## 129: RBK 
## 130: RWW 
## 131: RWG 
## 132: RGR 
## 133: RYR 
## 134: RKE 
## 135: RBL 
## 136: RWP 
## 137: RRF 
## 138: RLQ 
## 139: RA4 
## 140: RCB

We plot the network and colour the nodes according to node relevance given by (18).

  library(igraph)
library(GNAR)
  load("/Users/danielsalnikov/Documents/PhD/my_stuff/papers/corbit_paper/R_scripts/R_scripts_revised/node_relevance_plot_data.RData")
  
  nhs_adjacency = as.matrix(GNARtoigraph(NHSTrustMVCAug120.net))
  colnames(nhs_adjacency) = colnames(logMVbedMVC.vts)
  rownames(nhs_adjacency) = colnames(logMVbedMVC.vts)
  
  nhs_net_igraph = graph_from_adjacency_matrix(nhs_adjacency, "undirected")
  plot(nhs_net_igraph)

  V(nhs_net_igraph)$relevance <- node_data_aux[, 2]

  V(nhs_net_igraph)$color <- node_data_aux[, 3]

  V(nhs_net_igraph)$label.color <- rep("white", 140)
  plot(nhs_net_igraph)